Abstract: We discuss several tricks for solving the twenty question
problem, which in this paper is depicted as a guessing game. A
player tries to find a ball hidden in one of twenty boxes by asking
as few questions as possible, and each question is answered only by
"Yes" or "No". The main concern of the discussion is to demonstrate
source coding methods.



I. Introduction 

The unit of computation in a modern computer is still binary, and a
"Yes or No" question is a good illustration of such computing:
asking one question is equivalent to spending one bit of
computation resource. This discussion is intended to give an
intuition for symbol source coding by examining different ways of
solving a concrete twenty question problem.

The rest of this paper is organized as follows. Section II
introduces one-by-one asking. Section III is about top-down
division. In Section IV we discuss down-top merging. The work is
concluded in Section V.
Fig. 2. Illustration of One-by-One Asking
Fig. 3. Illustration of Top-Down Division



II. One-by-One Asking

We depict the TQP (Twenty Question Problem) with 20 boxes, only one
of which contains a ball, as shown in Fig. 1. With the first method,
we arbitrarily choose one box and claim that it contains the ball.
If we open the box and find nothing, which is equivalent to the
answer "No", we gain information content $\log_2 \frac{20}{19}$. We
then draw another box; if we miss the ball again, we gain
information content $\log_2 \frac{19}{18}$. Proceeding in this way,
assume the ball is found at step $N$ $(1 \le N \le 20)$. The total
information content obtained up to that point is

$\log_2 \frac{20}{19} + \log_2 \frac{19}{18} + \cdots + \log_2 \frac{22-N}{21-N} + \log_2 \frac{21-N}{1} = \log_2 \frac{20}{1} = 4.3219$ bits.
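The telescoping sum can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms and equally likely boxes; the function name `information_when_found_at` is introduced here for illustration only.

```python
import math

BOXES = 20  # number of boxes in the TQP


def information_when_found_at(step, boxes=BOXES):
    """Total information (bits) gathered when the ball is found at `step`."""
    total = 0.0
    remaining = boxes
    # Each "No" rules out one of the `remaining` equally likely boxes.
    for _ in range(step - 1):
        total += math.log2(remaining / (remaining - 1))
        remaining -= 1
    # The final "Yes" singles out one box among the `remaining` candidates.
    total += math.log2(remaining)
    return total


# The total is the same no matter at which step the ball is found.
for n in range(1, BOXES + 1):
    assert math.isclose(information_when_found_at(n), math.log2(BOXES))

print(round(math.log2(BOXES), 4))  # 4.3219
```

Whatever the value of N, the intermediate terms cancel and only $\log_2 20$ remains.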



Fig. 1. Only one of twenty boxes includes a ball 

Without loss of generality, the guessing process is illustrated as
choosing the boxes in order from left to right, as shown in Fig. 2.
Every guess results in "Yes" or "No". Imagine that 1 bit is spent
on every guess. Then the expected bits needed to solve the TQP with
the One-by-One



III. Top-Down Division
Before every question we divide the boxes as equally as possible
into two groups, then ask whether the ball is in one of the two
groups. According to the answer, we repeat this strategy until the
ball is found. This division process is shown in Fig. 3. In this way
the expected number of bits spent is
$1 + 1 + 1 + 1 + 1 \times \frac{8}{20} = 4.4$ bits, since a fifth
question is needed only with probability $\frac{8}{20}$.
The information content obtained in this way is again
$\log_2 20 = 4.3219$ bits.
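The division strategy can be sketched as a short recursion. This is an illustrative sketch, not the paper's own procedure: it assumes that a group with an odd number of boxes is split into a smaller and a larger half (5 into 2 and 3, 3 into 1 and 2), and the function names are introduced here.

```python
import math
from fractions import Fraction


def leaf_depths(boxes, depth=0):
    """Number of questions needed to reach each single box."""
    if boxes == 1:
        return [depth]
    left = boxes // 2        # assumption: the smaller half comes first
    right = boxes - left
    return leaf_depths(left, depth + 1) + leaf_depths(right, depth + 1)


def split_information(boxes, total=20):
    """Sum over all questions of P(question is asked) * entropy of its answer."""
    if boxes == 1:
        return 0.0
    left = boxes // 2
    right = boxes - left
    p_left, p_right = left / boxes, right / boxes
    answer_entropy = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return ((boxes / total) * answer_entropy
            + split_information(left, total)
            + split_information(right, total))


depths = leaf_depths(20)
expected_questions = Fraction(sum(depths), len(depths))
print(expected_questions)                      # 22/5
print(sorted(set(depths)))                     # [4, 5]
print(round(split_information(20), 4))         # 4.3219
```

Twelve boxes are reached after four questions and eight boxes after five, which gives the expected 4.4 bits.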
IV. Down-Top Merging



The smartest way presented here is to merge the options in the
down-top direction, following the Huffman coding method [1]. Every
box has the same probability $\frac{1}{20}$ of containing the ball.
Combine two of the boxes and imagine that they become one bigger
box; the probability that the ball is in this bigger box is then
$\frac{2}{20}$. For every merge we make sure that the two boxes
(real or imagined) have the smallest probabilities of containing
the ball. For example, after the first merge we have one bigger box
with probability $\frac{2}{20}$ and 18 boxes with probability
$\frac{1}{20}$, so 9 more bigger boxes should be formed from those
18 boxes. We repeat merging bigger boxes until we have a box that
contains the ball with probability 1. This merging process is shown
in Fig. 4. From this process, the number of bits spent is
$1 + \frac{12}{20} + (\frac{8}{20} \times 2) + (\frac{4}{20} \times 5) + (\frac{2}{20} \times 10) = 4.4$ bits.
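The merging procedure can be sketched with a priority queue. This is a minimal illustration using Python's standard `heapq` module; the function name `huffman_expected_length` and the tie-breaking counter are choices made for this sketch. Each merge costs one question, weighted by the probability of the merged box.

```python
import heapq
from fractions import Fraction


def huffman_expected_length(probabilities):
    """Expected number of questions, as the sum of all merged probabilities."""
    # The integer in each entry breaks ties between equal probabilities.
    heap = [(p, i) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    counter = len(heap)
    expected = Fraction(0)
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)   # smallest probability
        p2, _ = heapq.heappop(heap)   # second smallest probability
        merged = p1 + p2
        expected += merged            # one question is spent on this merge
        heapq.heappush(heap, (merged, counter))
        counter += 1
    return expected


boxes = [Fraction(1, 20)] * 20
print(huffman_expected_length(boxes))  # 22/5
```

The merged probabilities are ten times 2/20, five times 4/20, twice 8/20, then 12/20 and 1, which sum to 22/5 = 4.4.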
(a) Huffman D-T Merging
Fig. 5. Comparison between Huffman method and Greedy division
Fig. 4. Illustration of Down-Top Merging



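The information content obtained in this way is

$1 \times \left(\frac{8}{20}\log_2\frac{20}{8} + \frac{12}{20}\log_2\frac{20}{12}\right) + \frac{12}{20} \times \left(\frac{8}{12}\log_2\frac{12}{8} + \frac{4}{12}\log_2\frac{12}{4}\right) + \frac{8}{20} \times 1 \times 2 + \frac{4}{20} \times 1 \times 5 + \frac{2}{20} \times 1 \times 10 = 4.3219$ bits.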
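As a numerical check of this sum, the sketch below evaluates each term separately, assuming base-2 logarithms. The helper name `answer_entropy` is introduced here for illustration.

```python
import math


def answer_entropy(p_a, p_b):
    """Information (bits) of a yes/no answer with probabilities p_a and p_b."""
    return p_a * math.log2(1 / p_a) + p_b * math.log2(1 / p_b)


information = (
    1 * answer_entropy(8 / 20, 12 / 20)             # top merge: 8/20 with 12/20
    + (12 / 20) * answer_entropy(8 / 12, 4 / 12)    # inside the 12/20 box
    + (8 / 20) * 1 * 2      # two 8/20 boxes, each an even split (1 bit)
    + (4 / 20) * 1 * 5      # five 4/20 boxes, each an even split
    + (2 / 20) * 1 * 10     # ten 2/20 boxes, each an even split
)

print(round(information, 4))  # 4.3219
```

Only the two uneven merges contribute less than one bit per question, which is why 4.4 questions yield 4.3219 bits of information.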

V. Conclusion 

From the above discussion, we can conclude that the three tricks
obtain the same information content in finding the ball, but the
first method consumes on average much more effort than the latter
two. For the TQP, the Top-Down Division method and the Down-Top
Merging method consume the same expected number of bits to achieve
the goal. In general, however, they are not equally efficient:
Down-Top Merging is optimal while Top-Down Division is sub-optimal,
just as nuclear fusion releases much more energy than nuclear
fission.
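To put a rough number on the extra effort of the first method, the sketch below computes its expected number of questions. It rests on an assumption not stated in the text: one question is counted per opened box, and after 19 "No" answers the ball is located without asking about the last box. The function name is hypothetical.

```python
from fractions import Fraction


def one_by_one_expected_questions(boxes=20):
    # Assumption: after (boxes - 1) "No" answers the ball is located
    # without a further question, so the count is capped at boxes - 1.
    counts = [min(position, boxes - 1) for position in range(1, boxes + 1)]
    return Fraction(sum(counts), boxes)


one_by_one = one_by_one_expected_questions()
other_methods = Fraction(22, 5)   # 4.4 bits, as found for the other two methods

print(one_by_one)                 # 209/20
print(float(one_by_one))          # 10.45
print(one_by_one / other_methods) # 19/8
```

Under this counting, one-by-one asking spends more than twice as many questions on average as the other two methods.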

Theorem 1. For symbol coding, the Huffman code is optimal.

Proof: Let the symbol set $\mathcal{A}_X = \{x_1, \cdots, x_N\}$ have
probabilities $\mathcal{P}_X = \{p_1, \cdots, p_N\}$. Use the
division or merging method to construct codes for the symbols; each
division or merge creates a new level. At any level $l$ there are
intermediate symbols $\mathcal{A}_l = \{\alpha_1, \cdots, \alpha_{n_l}\}$
$(2 \le n_l \le N)$ with probabilities
$\mathcal{P}_l = \{p_1, \cdots, p_{n_l}\}$ $(\sum_{i=1}^{n_l} p_i = 1)$.
With the Huffman coding method, at level $l$ we merge the two
symbols $\alpha_i$ and $\alpha_j$ such that
$\forall k \in \{1, \cdots, n_l\}$ with $k \ne i$, $k \ne j$:
$p_k \ge p_i, p_j$. The bits consumed by this merge are
$1 \times (p_i + p_j)$. With any other code, if at level $l$ two
symbols $\alpha_{k_1}$ and $\alpha_{k_2}$ are merged into, or
divided from, level $(l-1)$, the bits consumed are
$1 \times (p_{k_1} + p_{k_2}) \ge 1 \times (p_i + p_j)$ if
$k_1, k_2 \ne i, j$. Summing the bits consumed at all levels, we
find that the Huffman code is the shortest.

■ 
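The step of summing the bits consumed at all levels relies on a standard identity for binary code trees, written out here as a sketch. The symbols $\ell_i$ (codeword length of $x_i$), $\mathcal{V}$ (the set of internal nodes, i.e. merges or divisions) and $p(v)$ (total probability of the symbols below node $v$) are notation introduced for this note.

```latex
\begin{align*}
L \;=\; \sum_{i=1}^{N} p_i\,\ell_i
  \;=\; \sum_{i=1}^{N} p_i \sum_{v \in \mathcal{V}} \mathbf{1}\,[\,x_i \text{ lies below } v\,]
  \;=\; \sum_{v \in \mathcal{V}} \; \sum_{i \,:\, x_i \text{ below } v} p_i
  \;=\; \sum_{v \in \mathcal{V}} p(v).
\end{align*}
```

Each codeword length equals the number of merges lying above its symbol, so the expected length is the sum of the merged probabilities, one term $1 \times (p_i + p_j)$ per merge.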

Take Fig. 5 as an example. For its four-symbol set $\mathcal{P}_X$,
Huffman merging gives an expected code length of approximately
1.87 bits, while greedy division gives an expected code length of
$1 + 1 = 2$ bits.
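The gap between the two methods can be reproduced with a small sketch. The probabilities used below, {2/5, 1/3, 1/5, 1/15}, are hypothetical: they were chosen only because they give the same two expected lengths, and are not taken from Fig. 5. The greedy rule assumed here splits a group into two parts whose probabilities are as equal as possible; both function names are introduced for illustration.

```python
import heapq
from fractions import Fraction
from itertools import combinations


def huffman_length(probs):
    """Expected length from repeatedly merging the two smallest probabilities."""
    heap = [(p, i) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter, expected = len(heap), Fraction(0)
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, _ = heapq.heappop(heap)
        expected += p1 + p2
        heapq.heappush(heap, (p1 + p2, counter))
        counter += 1
    return expected


def greedy_division_length(probs):
    """Expected length from splitting into two near-equal halves, recursively."""
    if len(probs) == 1:
        return Fraction(0)
    best = None
    for size in range(1, len(probs)):
        for chosen in combinations(range(len(probs)), size):
            left = [probs[i] for i in chosen]
            right = [probs[i] for i in range(len(probs)) if i not in chosen]
            gap = abs(sum(left) - sum(right))
            if best is None or gap < best[0]:
                best = (gap, left, right)
    _, left, right = best
    # One question is spent on this division, weighted by the group's probability.
    return sum(probs) + greedy_division_length(left) + greedy_division_length(right)


# Hypothetical probabilities, not the values of Fig. 5.
hypothetical_probs = [Fraction(2, 5), Fraction(1, 3), Fraction(1, 5), Fraction(1, 15)]
print(huffman_length(hypothetical_probs))          # 28/15 (about 1.87)
print(greedy_division_length(hypothetical_probs))  # 2
```

For this set, greedy division first separates {2/5, 1/15} from {1/3, 1/5} and so assigns two bits to every symbol, while Huffman merging gives the most likely symbol a one-bit codeword.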



